Setup

Data Source: https://www.kaggle.com/praveengovi/emotions-dataset-for-nlp

Imports

Data Imports
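A minimal sketch of the import step, using only the standard library. It assumes each dataset file holds one `text;label` pair per line; that layout, the `SAMPLE` string, and the helper name `read_emotion_file` are illustrative assumptions, not taken from the notebook.

```python
import csv
import io

# Hypothetical sample standing in for one dataset file.
# The "text;label" layout is an assumption about the data source.
SAMPLE = "i feel great today;joy\ni am so angry at him;anger\n"

def read_emotion_file(handle):
    """Return a list of (text, label) tuples from a semicolon-delimited file."""
    reader = csv.reader(handle, delimiter=";")
    # Keep only well-formed rows with exactly two fields.
    return [(row[0], row[1]) for row in reader if len(row) == 2]

rows = read_emotion_file(io.StringIO(SAMPLE))
```

With a real file, the same helper could be called as `read_emotion_file(open(path, newline="", encoding="utf-8"))`, where `path` is whatever location the download was saved to.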

The test and validation datasets appear to contain the same data.
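One way to confirm this kind of observation is to compare the two splits as sets of records. The sketch below assumes each split is a list of `(text, label)` tuples; the helper name `overlap_report` and the sample rows are hypothetical.

```python
def overlap_report(rows_a, rows_b):
    """Summarize how many distinct records two splits share."""
    set_a, set_b = set(rows_a), set(rows_b)
    return {
        "shared": len(set_a & set_b),   # records present in both splits
        "only_a": len(set_a - set_b),   # records unique to the first split
        "only_b": len(set_b - set_a),   # records unique to the second split
        "identical": set_a == set_b,    # True when both hold the same records
    }

# Hypothetical rows for illustration only.
test_rows = [("i feel fine", "joy"), ("so tired", "sadness")]
val_rows = [("so tired", "sadness"), ("i feel fine", "joy")]
report = overlap_report(test_rows, val_rows)
```

If `identical` is true, evaluating on both splits adds no information, so one of them should not be treated as an independent hold-out set.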

Basic EDA

How much data?

Missing Values
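A dependency-free sketch of a missing-value check, assuming records are `(text, label)` tuples and that a value counts as missing when it is `None` or blank. The helper name and sample records are hypothetical.

```python
def count_missing(rows):
    """Count records whose text or label is None or blank."""
    missing = {"text": 0, "label": 0}
    for text, label in rows:
        if text is None or not text.strip():
            missing["text"] += 1
        if label is None or not label.strip():
            missing["label"] += 1
    return missing

# Hypothetical records for illustration only.
rows = [("i feel fine", "joy"), ("", "anger"), ("so tired", None), ("   ", "fear")]
missing = count_missing(rows)
```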

Data Distribution by Emotions
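The class balance can be summarized with a simple frequency count. This sketch uses `collections.Counter`; the label list is made up for illustration and does not reflect the real dataset's proportions.

```python
from collections import Counter

def label_distribution(labels):
    """Map each label to (count, percentage), most frequent first."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {
        label: (n, round(100 * n / total, 1))
        for label, n in counts.most_common()
    }

# Hypothetical labels for illustration only.
labels = ["joy", "sadness", "joy", "anger", "joy", "sadness", "fear", "joy"]
dist = label_distribution(labels)
```

A strongly skewed distribution is worth noting before training, since accuracy alone can then hide poor performance on the rare emotions.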

Data Preparation

Label Encoding
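Label encoding maps each emotion name to an integer id. The sketch below assigns ids in sorted order of the class names, which is one common convention; the function names are hypothetical and the notebook may use a library encoder instead.

```python
def fit_label_encoder(labels):
    """Build a label -> integer id mapping, with ids in sorted label order."""
    classes = sorted(set(labels))
    return {label: idx for idx, label in enumerate(classes)}

def encode(labels, mapping):
    """Convert label names to integer ids."""
    return [mapping[label] for label in labels]

def decode(ids, mapping):
    """Convert integer ids back to label names."""
    inverse = {idx: label for label, idx in mapping.items()}
    return [inverse[i] for i in ids]

# Hypothetical labels for illustration only.
labels = ["joy", "anger", "sadness", "joy"]
mapping = fit_label_encoder(labels)
encoded = encode(labels, mapping)
```

The mapping should be fitted on the training labels only and then reused for the other splits so that ids stay consistent.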

One Hot Encoding and Train Test Data Prep
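One-hot encoding turns each integer class id into a vector with a single 1 at that id's position. A minimal sketch in plain Python, assuming the ids come from a prior label-encoding step; `one_hot` is a hypothetical helper name.

```python
def one_hot(ids, num_classes):
    """Return one row per id, with a 1 in the column matching the id."""
    rows = []
    for class_id in ids:
        if not 0 <= class_id < num_classes:
            raise ValueError(f"class id {class_id} out of range")
        row = [0] * num_classes
        row[class_id] = 1
        rows.append(row)
    return rows

# Hypothetical encoded labels for illustration only.
targets = one_hot([1, 0, 2], num_classes=3)
```

This target shape pairs with a softmax output layer and a categorical cross-entropy loss.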

Generate Word Embeddings

Analyze & select max_sequence_length
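One way to pick `max_sequence_length` is to choose the token count that covers a given share of the texts, so that only a few long outliers get truncated. The sketch below uses a nearest-rank percentile over whitespace token counts; the 95% default and the helper name `pick_max_length` are assumptions, not values from the notebook.

```python
def pick_max_length(texts, coverage_pct=95):
    """Return the token length that covers coverage_pct percent of texts."""
    lengths = sorted(len(text.split()) for text in texts)
    if not lengths:
        raise ValueError("texts must not be empty")
    # Nearest-rank percentile, using an integer ceiling to avoid float error.
    rank = -(-coverage_pct * len(lengths) // 100)
    rank = min(max(rank, 1), len(lengths))
    return lengths[rank - 1]

# Hypothetical texts with 1 to 10 tokens each, for illustration only.
texts = [" ".join(["w"] * n) for n in range(1, 11)]
max_sequence_length = pick_max_length(texts, coverage_pct=90)
```

A lower coverage value gives shorter, cheaper sequences at the cost of truncating more texts.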

Alternatively, use Google's Embedding Projector: http://projector.tensorflow.org/

Upload the learned embedding vectors (and, optionally, the word metadata) as TSV files to explore them interactively. This is much clearer than this diagram.

Gensim API - https://radimrehurek.com/gensim/models/word2vec.html

Text to Sequences using Gensim
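Converting text to sequences means replacing each token with its vocabulary index. In Gensim 4 a trained model exposes this mapping as `model.wv.key_to_index`; the sketch below uses a plain dict as a stand-in so it runs without Gensim. Lowercasing, whitespace tokenization, skipping unknown words, and shifting indices by one so that 0 stays free for padding are all assumptions here.

```python
def texts_to_sequences(texts, key_to_index):
    """Map each text to a list of 1-based vocabulary indices."""
    sequences = []
    for text in texts:
        tokens = text.lower().split()
        # Skip out-of-vocabulary tokens; shift by 1 so 0 can mean "padding".
        sequences.append(
            [key_to_index[tok] + 1 for tok in tokens if tok in key_to_index]
        )
    return sequences

# Hypothetical vocabulary standing in for model.wv.key_to_index.
key_to_index = {"i": 0, "feel": 1, "happy": 2, "sad": 3}
sequences = texts_to_sequences(["I feel happy", "i feel utterly sad"], key_to_index)
```

If indices are shifted like this, the embedding matrix passed to the classifier needs a matching extra row at position 0.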

Standardize Sequence Length as Selected Earlier
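Standardizing length means truncating long sequences and padding short ones to the chosen `max_sequence_length`. This plain-Python sketch pads and truncates at the end of each sequence; that choice, the pad value of 0, and the helper name are assumptions (library padding utilities may default to the front instead).

```python
def pad_sequences_post(sequences, max_len, pad_value=0):
    """Truncate or right-pad every sequence to exactly max_len items."""
    padded = []
    for seq in sequences:
        trimmed = list(seq[:max_len])
        padded.append(trimmed + [pad_value] * (max_len - len(trimmed)))
    return padded

# Hypothetical index sequences for illustration only.
padded = pad_sequences_post([[1, 2, 3], [4], [5, 6, 7, 8, 9]], max_len=4)
```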

Multiclass Classifier
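Whatever architecture is used, a multiclass classifier typically ends in a softmax over the emotion classes and predicts the class with the highest probability. The sketch below shows only that final step in plain Python; the six logits are made-up values, and six classes is an assumption about this dataset.

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict_class(logits):
    """Return the index of the most probable class."""
    probs = softmax(logits)
    return max(range(len(probs)), key=probs.__getitem__)

# Hypothetical output scores for one text, one per emotion class.
logits = [0.2, 2.5, -1.0, 0.0, 0.3, -0.5]
probs = softmax(logits)
predicted = predict_class(logits)
```

The predicted index can be mapped back to an emotion name with the inverse of the label-encoding mapping.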